Simplicial Data Analysis: theory, practice, and algorithms
Simplicial complexes store key information about a topological space in discrete form, and have been used in mathematics to introduce combinatorial and discrete tools in geometry and topology. They represent a topological space as a collection of ‘simple elements’ (such as vertices, edges, triangles, tetrahedra, and more general simplices) that are glued to each other in a structured manner. In the last 40 years, they have been a basic tool in computer visualization for storing and classifying different shapes of 3D images; in the early 2000s these techniques were successfully applied to more general data, not necessarily sampled from a metric space.
The use of techniques borrowed from algebraic topology has been very successful in analysing data from various fields: genomics, sensor analysis, brain connectomics, fMRI data, and trade networks, and new fields of application are being tested every day. Regrettably, topological data analysis has been used mainly as a qualitative method, the problem being the lack of proper tools to perform effective statistical analysis.
Building on well-established techniques in random graph theory, the first models for random simplicial complexes have been introduced in recent years, though none of them can be used effectively in a quantitative analysis of data. We introduce a model that can be used as a null model for simplicial complexes because it fixes the size distribution of facets.
Another challenge is to identify a simplicial complex that correctly encodes the topological space from which the initial data set is sampled. The most common solution is to build nested simplicial complexes and study the evolution of their features. A recent study uncovered that the problem can reside in making wrong assumptions about the space of data. We propose a categorical argument that sheds light on the cause of these misconceptions. We introduce a new category for weighted graphs and study its relation to other common categories when the weights are chosen in a poset.
The construction of the appropriate simplicial complex is not the only obstacle one faces when applying topological methods to real data. Available algorithms for extracting homological features have memory and time complexity that scale exponentially in the number of simplices, making these techniques unsuitable for the analysis of ‘big data’. We propose a quantum algorithm that is able to track in logarithmic time the evolution of a quantum version of well-known homological features along a filtration of simplicial complexes.
P-persistent homology of finite topological spaces
Let P be a finite poset. We will show that for any reasonable
P-persistent object X in the category of finite topological spaces, there
is a P-weighted graph, whose clique complex has the same P-persistent
homology as X.
Construction of and efficient sampling from the simplicial configuration model
Simplicial complexes are now a popular alternative to networks when it comes
to describing the structure of complex systems, primarily because they encode
multi-node interactions explicitly. With this new description comes the need
for principled null models that allow for easy comparison with empirical data.
We propose a natural candidate, the simplicial configuration model. The core of
our contribution is an efficient and uniform Markov chain Monte Carlo sampler
for this model. We demonstrate its usefulness in a short case study by
investigating the topology of three real systems and their randomized
counterparts (using their Betti numbers). For two out of three systems, the
model allows us to reject the hypothesis that there is no organization beyond
the local scale. Comment: 6 pages, 4 figures
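The comparison above relies on Betti numbers of a simplicial complex. As a rough illustration only, the following sketch computes Betti numbers over GF(2) from a list of facets by taking ranks of boundary matrices. The function names (`closure`, `rank_gf2`, `betti_numbers`) are hypothetical and are not taken from the paper; the paper's own pipeline and its MCMC sampler are not reproduced here.

```python
from itertools import combinations


def closure(facets):
    """Return all nonempty faces of the given facets, grouped by dimension."""
    faces = set()
    for facet in facets:
        verts = sorted(facet)
        for k in range(1, len(verts) + 1):
            faces.update(combinations(verts, k))
    by_dim = {}
    for face in faces:
        by_dim.setdefault(len(face) - 1, []).append(face)
    return {dim: sorted(fs) for dim, fs in by_dim.items()}


def rank_gf2(rows):
    """Rank over GF(2) of a matrix whose rows are given as integer bitmasks."""
    pivots = {}  # highest set bit -> reduced row with that leading bit
    for row in rows:
        while row:
            top = row.bit_length() - 1
            if top not in pivots:
                pivots[top] = row
                break
            row ^= pivots[top]  # cancel the leading bit and keep reducing
    return len(pivots)


def betti_numbers(facets):
    """Betti numbers (GF(2) coefficients) of the complex spanned by facets."""
    by_dim = closure(facets)
    top = max(by_dim)
    # ranks[k] is the rank of the boundary map from k-chains to (k-1)-chains.
    ranks = {0: 0, top + 1: 0}
    for k in range(1, top + 1):
        index = {face: i for i, face in enumerate(by_dim[k - 1])}
        rows = []
        for simplex in by_dim[k]:
            mask = 0
            for face in combinations(simplex, k):  # the (k-1)-faces
                mask |= 1 << index[face]
            rows.append(mask)
        ranks[k] = rank_gf2(rows)
    # Betti_k = dim C_k - rank d_k - rank d_{k+1}
    return [len(by_dim[k]) - ranks[k] - ranks[k + 1] for k in range(top + 1)]
```

For instance, `betti_numbers([(0, 1), (1, 2), (0, 2)])` describes a hollow triangle (one component, one cycle), while adding the facet `(0, 1, 2)` fills the cycle. A null-model test in the spirit of the abstract would compare such numbers between an empirical complex and randomized counterparts.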
The shape of collaborations
The structure of scientific collaborations has been the object of intense study both for its importance for innovation and scientific advancement, and as a model system for social group coordination and formation thanks to the availability of authorship data. Over the last years, complex network approaches to this problem have yielded important insights and shaped our understanding of scientific communities. In this paper we propose to complement the picture provided by network tools with that coming from using simplicial descriptions of publications and the corresponding topological methods. We show that it is natural to extend the concept of triadic closure to simplicial complexes and show the presence of strong simplicial closure. Focusing on the differences between scientific fields, we find that, while categories are characterized by different collaboration size distributions, the distribution of the number of collaborations in which an author is able to participate is conserved across fields, pointing to underlying attentional and temporal constraints. We then show that homological cycles, which can intuitively be thought of as holes in the network fabric, are an important part of the underlying community linking structure.
P-persistent homology of finite topological spaces
Let P be a finite poset. We will show that for any reasonable P-persistent object X in the category of finite topological spaces, there is a P-weighted graph, whose clique complex has the same P-persistent homology as X.
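To make the clique complex of a weighted graph concrete, here is a minimal sketch under a simplifying assumption: the weights are real numbers (a totally ordered special case of a poset P), and an edge is present at a threshold when its weight is at most that threshold. The function name `clique_complex` and the input format are hypothetical, and the brute-force enumeration is only meant for small graphs.

```python
from itertools import combinations


def clique_complex(weighted_edges, threshold, max_dim=2):
    """Simplices (as sorted tuples) of the clique complex of the subgraph
    containing the edges whose weight is <= threshold.

    weighted_edges: iterable of (u, v, weight) triples.
    All vertices are kept at every threshold (an assumption of this sketch).
    """
    adj = {}
    for u, v, w in weighted_edges:
        adj.setdefault(u, set())
        adj.setdefault(v, set())
        if w <= threshold:
            adj[u].add(v)
            adj[v].add(u)
    vertices = sorted(adj)
    simplices = [(v,) for v in vertices]
    # A set of k+1 vertices spans a k-simplex iff every pair is adjacent.
    for k in range(1, max_dim + 1):
        for cand in combinations(vertices, k + 1):
            if all(b in adj[a] for a, b in combinations(cand, 2)):
                simplices.append(cand)
    return simplices


edges = [("a", "b", 1), ("b", "c", 2), ("a", "c", 3)]
small = clique_complex(edges, threshold=2)  # path a-b-c, no triangle yet
large = clique_complex(edges, threshold=3)  # all three edges, so the triangle appears
```

Increasing the threshold can only add simplices, so the complexes obtained at successive thresholds are nested, which is what makes it possible to follow homological features along the family.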
Exact and rapid linear clustering of networks with dynamic programming
We study the problem of clustering networks whose nodes have imputed or
physical positions in a single dimension, such as prestige hierarchies or the
similarity dimension of hyperbolic embeddings. Existing algorithms, such as the
critical gap method and other greedy strategies, only offer approximate
solutions. Here, we introduce a dynamic programming approach that returns
provably optimal solutions in polynomial time -- O(n^2) steps -- for a broad
class of clustering objectives. We demonstrate the algorithm through
applications to synthetic and empirical networks, and show that it outperforms
existing heuristics by a significant margin, with a similar execution time. Comment: 13 pages, 8 figures
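The general shape of such a dynamic program can be sketched as follows, assuming the nodes are already sorted along the line, clusters are contiguous blocks, and the objective is a sum of per-block scores. This is an illustrative sketch, not the authors' implementation: the names `optimal_linear_clustering` and `make_quality` are hypothetical, and the example objective (negative within-block squared deviation of positions minus a per-block penalty) is a stand-in chosen for simplicity rather than one of the network objectives studied in the paper.

```python
def optimal_linear_clustering(n, quality):
    """Maximise the sum of quality(i, j) over partitions of 0..n-1 into
    contiguous blocks [i, j). Makes O(n^2) calls to quality."""
    best = [0.0] + [float("-inf")] * n  # best[j]: optimum for the first j nodes
    prev = [0] * (n + 1)                # prev[j]: start of the last block
    for j in range(1, n + 1):
        for i in range(j):
            score = best[i] + quality(i, j)
            if score > best[j]:
                best[j], prev[j] = score, i
    # Backtrack from the end to recover the blocks.
    blocks, j = [], n
    while j > 0:
        blocks.append((prev[j], j))
        j = prev[j]
    return best[n], blocks[::-1]


def make_quality(xs, penalty):
    """Example objective (an assumption of this sketch): negative sum of
    squared deviations inside a block, minus a fixed cost per block."""
    s1, s2 = [0.0], [0.0]  # prefix sums of x and x^2 give O(1) block scores
    for x in xs:
        s1.append(s1[-1] + x)
        s2.append(s2[-1] + x * x)

    def quality(i, j):
        total = s1[j] - s1[i]
        sse = (s2[j] - s2[i]) - total * total / (j - i)
        return -sse - penalty

    return quality


xs = [0.0, 0.1, 0.2, 5.0, 5.1]
score, blocks = optimal_linear_clustering(len(xs), make_quality(xs, penalty=1.0))
```

Because each block score is evaluated in constant time from the prefix sums, the double loop dominates and the whole procedure takes on the order of n^2 steps, matching the complexity quoted in the abstract for objectives that decompose this way.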
Topological Learning for Motion Data via Mixed Coordinates
Topology can extract the structural information in a dataset efficiently. In
this paper, we attempt to incorporate topological information into a multiple
output Gaussian process model for transfer learning purposes. To achieve this
goal, we extend the framework of circular coordinates into a novel framework of
mixed valued coordinates to take linear trends in the time series into
consideration.
One of the major challenges to learn from multiple time series effectively
via a multiple output Gaussian process model is constructing a functional
kernel. We propose to use topologically induced clustering to construct a
cluster based kernel in a multiple output Gaussian process model. This kernel
not only incorporates the topological structural information, but also allows
us to put forward a unified framework using topological information in time and
motion series. Comment: 7 pages, 4 figures
Complex Politics: A Quantitative Semantic and Topological Analysis of UK House of Commons Debates
This study is a first, exploratory attempt to use quantitative semantics
techniques and topological analysis to analyze systemic patterns arising in a
complex political system. In particular, we use a rich data set covering all
speeches and debates in the UK House of Commons between 1975 and 2014. By the
use of dynamic topic modeling (DTM) and topological data analysis (TDA) we show
that both members and parties feature specific roles within the system,
consistent over time, and extract global patterns indicating levels of
political cohesion. Our results provide a wide array of novel hypotheses about
the complex dynamics of political systems, with valuable policy applications.